Web Crawls
collection
13,744,263
ITEMS
36.8B
VIEWS
-
collection
eye 36.8B

The Internet Archive's Web Archive began in late 1996 and is made available through the Wayback Machine; some collections are also available in bulk to researchers. Many pages are archived by the Internet Archive on behalf of other contributors, including Archive-It partners and Save Page Now users. Other captures are donated to the Internet Archive by partners such as Alexa Internet.
Topic: Web Archive

Internet Archive Web Crawls
collection
1,299,303
ITEMS
19.3B
VIEWS
-
collection
eye 19.3B

The Internet Archive discovers and captures web pages through many different web crawls. At any given time, several distinct crawls are running: some run for months, and others run daily or at longer intervals. View the web archive through the Wayback Machine.
Topic: webwidecrawl

Worldwide Web Crawls
collection
612,848
ITEMS
8.3B
VIEWS
-
collection
eye 8.3B

Wide crawls of the Internet conducted by Internet Archive. Please visit the Wayback Machine to explore archived web sites. Since September 10th, 2010, the Internet Archive has been running Worldwide Web Crawls of the global web, capturing web elements, pages, sites and parts of sites. Each Worldwide Web Crawl was initiated from one or more lists of URLs that are known as "Seed Lists". Descriptions of the Seed Lists associated with each crawl may be provided as part of the metadata for...

Alexa Crawls
collection
207,942
ITEMS
7.4B
VIEWS
-
collection
eye 7.4B

Starting in 1996, Alexa Internet has been donating their crawl data to the Internet Archive. Flowing in every day, these data are added to the Wayback Machine after an embargo period.
Topics: web crawl, Alexa

Audio Archive
collection
6,791,860
ITEMS
6.3B
VIEWS
-
collection
eye 6.3B

Download or listen to free music and audio. This library contains recordings ranging from alternative news programming, to Grateful Dead concerts, to Old Time Radio shows, to book and poetry readings, to original music uploaded by our users. Many of these audio files and MP3s are available for free download. Check our FAQ for more information. Contribute Your Audio: Please feel free to upload your audio (Uploaders, please set a Creative Commons license as part of the upload process, so people know...
Topic: Audio

Images
collection
3,527,745
ITEMS
5.8B
VIEWS
-
collection
eye 5.8B

This library contains digital images uploaded by Archive users which range from maps to astronomical imagery to photographs of artwork. Many of these images are available for free download.
Topic: images

eBooks and Texts
collection
21,369,335
ITEMS
5.7B
VIEWS
-
collection
eye 5.7B

The Internet Archive offers over 15,000,000 freely downloadable books and texts. There is also a collection of 550,000 modern eBooks that may be borrowed by anyone with a free archive.org account. Borrow a Book: Books on Internet Archive are offered in many formats, including DAISY files intended for print-disabled people. In addition to the collections here, print-disabled people may access a large collection of modern books provided as encrypted DAISY files on...
Topic: Texts, Kindle, Ebook, Nook, Books

Survey Crawls
collection
96,576
ITEMS
4.7B
VIEWS
-
collection
eye 4.7B

Survey crawls are run about twice a year, on average, and attempt to capture the content of the front page of every web host ever seen by the Internet Archive since 1996.
Topic: survey crawls

Cover Art Archive
collection
93
ITEMS
4.4B
VIEWS
-
collection
eye 4.4B

To see or download images please visit MusicBrainz. The Cover Art Archive is a joint project between the Internet Archive and MusicBrainz, whose goal is to make cover art images available to everyone on the Internet in an organised and convenient way. Images in the archive are curated by the MusicBrainz community and go through a peer review process to ensure that they are correct, free of spam and of the best quality. If you would like to contribute cover art, create a MusicBrainz account...

Moving Image Archive
collection
5,066,828
ITEMS
4.2B
VIEWS
-
collection
eye 4.2B

Download or watch free movies, films, and videos. This library contains digital movies uploaded by Archive users which range from classic full-length films, to daily alternative news broadcasts, to cartoons and concerts. Many of these videos are available for free download. Check our FAQ for more information. Contribute Your Movies and Video: Please feel free to upload your movies (Uploaders, please set a Creative Commons license as part of the upload process, so people know what they can do...
Topic: Moving Images

Live Web Proxy Crawls
collection
62,954
ITEMS
4.2B
VIEWS
-
collection
eye 4.2B

Content crawled via the Wayback Machine Live Proxy, mostly by the Save Page Now feature on web.archive.org. The Liveweb proxy is a component of the Internet Archive's Wayback Machine project. It captures the content of a web page in real time, archives it into an ARC or WARC file, and returns the ARC/WARC record to the Wayback Machine for processing. The recorded ARC/WARC file becomes part of the Wayback Machine in due course.

Community Audio
collection
2,002,520
ITEMS
4.1B
VIEWS
-
collection
eye 4.1B

You are invited to listen to or upload audio to the Community collection. These thousands of recordings were all contributed by Archive users and community members. Please select a Creative Commons License during upload so that others will know what they may (or may not) do with your audio. Click here to contribute your audio! Browse by style: Blues, Country, Electronic, Experimental, Hiphop, Indie, Jazz, Rock, Spoken Word.

Community Video
collection
810,472
ITEMS
2.3B
VIEWS
-
collection
eye 2.3B

You are invited to view or upload your videos to the Community collection. These thousands of videos were contributed by Archive users and community members. These videos are available for free download. Please select a Creative Commons License during upload so that others will know what they may (or may not) do with your video. Click here to upload your video!
Topic: Moving Images

Archive-It Digital Collection
collection
2,335,287
ITEMS
2.2B
VIEWS
-
collection
eye 2.2B

Archive-It is a subscription web archiving service of the Internet Archive that helps organizations harvest, build, and preserve collections of digital content. Partners create domain-specific collections of web captures that can be searched on Archive-It. Content is hosted and stored at the Internet Archive data centers. Archive-It works with more than 400 partner organizations in 48 U.S. states and 16 countries worldwide including: College and University Libraries State Archives, Libraries,...
Topic: Colleges, Universities, Libraries, Archives, NGOs, Museums

Archive-It Partners
collection
2,330,828
ITEMS
2.2B
VIEWS
-
collection
eye 2.2B

Archive-It is the leading web archiving service for collecting and accessing cultural heritage on the web and is a service of Internet Archive used by libraries, archives, governments, non-profits, and other organizations to build collections of web materials.
Topic: TK

Focused Crawls
collection
270,020
ITEMS
1.8B
VIEWS
-
collection
eye 1.8B

Focused crawls are collections of frequently-updated webcrawl data from narrow (as opposed to broad or wide) web crawls, often focused on a single domain or subdomain.
Topic: webcrawl

Community Texts
collection
811,319
ITEMS
1.7B
VIEWS
-
collection
eye 1.7B

These books were contributed by the community. Click here to contribute your book! For more information and how-to, please see help.archive.org/hc/en-us/articles/360002360111-Uploading-A-Basic-Guide. Uploaders, please note: Archive.org supports metadata about items in just about any language, so long as the characters are UTF-8 encoded. Find books by language: Afar Books Afrikaans Books Akan Books Albanian Books Arabic Books Armenian Books Aymara Books Azerbaijan Books Balochi Books Bambara...
Topic: Texts

American Libraries
collection
2,996,760
ITEMS
1.7B
VIEWS
-
collection
eye 1.7B

The American Libraries collection includes material contributed from across the United States. Institutions range from the Library of Congress to many local public libraries. As a whole, this collection of material brings holdings that cover many facets of American life and scholarship into the public domain. Significant portions of this collection have been generously sponsored by Microsoft, Yahoo!, The Sloan Foundation, and others.

Community Media
collection
339,390
ITEMS
1.6B
VIEWS
-
collection
eye 1.6B

A collection of media donated by individuals to the Internet Archive.

Fix Broken Links Web Crawls
collection
100,260
ITEMS
1.6B
VIEWS
-
collection
eye 1.6B

These crawls are part of an effort to archive pages as they are created and archive the pages that they refer to. That way, as the pages that are referenced are changed or taken from the web, a link to the version that was live when the page was written will be preserved. Then the Internet Archive hopes that references to these archived pages will be put in place of a link that would otherwise be broken, or a companion link to allow people to see what was originally intended by a page's...

Archive Team
collection
536,679
ITEMS
1.5B
VIEWS
-
collection
eye 1.5B

Formed in 2009, the Archive Team (not to be confused with the archive.org Archive-It Team) is a rogue archivist collective dedicated to saving copies of rapidly dying or deleted websites for the sake of history and digital heritage. The group is 100% composed of volunteers and interested parties, and has expanded into a large number of related projects for saving online and digital history. History is littered with hundreds of conflicts over the future of a community, group, location or...

Top Domains
collection
136,288
ITEMS
1.2B
VIEWS
-
collection
eye 1.2B

A daily collection of thousands of the most popular web sites according to Alexa.com's top sites rankings.
Topics: daily, popular sites, Alexa

Audio Books & Poetry
collection
18,175
ITEMS
1.2B
VIEWS
-
collection
eye 1.2B

Listen to free audio books and poetry recordings! This library of audio books and poetry features digital recordings and MP3s from the Naropa Poetics Audio Archive, LibriVox, Project Gutenberg, Maria Lectrix, and Internet Archive users.

The LibriVox Free Audiobook Collection
collection
13,165
ITEMS
1.1B
VIEWS
-
collection
eye 1.1B

LibriVox - founded in 2005 - is a community of volunteers from all over the world who record public domain texts: poetry, short stories, whole books, even dramatic works, in many different languages. All LibriVox recordings are in the public domain in the USA and available as free downloads on the internet. If you are not in the USA, please check your country's copyright law before downloading. Please visit the LibriVox website where you can search for books that interest you. You can search or...

collection
eye 1.1B

The seed for this crawl was a list of every host in the Wayback Machine. This crawl was run at level 1 (URLs including their embeds, plus the URLs of all outbound links including their embeds). The WARC files associated with this crawl are not currently available to the general public.
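The scoping rule in this description can be illustrated with a small sketch. It assumes a toy in-memory page graph and reads "level" as the number of outbound-link hops followed from a seed, with embeds always captured alongside the page that references them. The `Page` type, the `PAGES` dictionary, and `urls_in_scope` are hypothetical names used for illustration; they are not part of Heritrix or any Internet Archive tooling.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Page:
    """A toy page: the resources it embeds and the pages it links out to."""
    embeds: list = field(default_factory=list)
    links: list = field(default_factory=list)


# Hypothetical miniature web, used only for illustration.
PAGES = {
    "http://a.example/": Page(embeds=["http://a.example/logo.png"],
                              links=["http://b.example/"]),
    "http://b.example/": Page(embeds=["http://b.example/style.css"],
                              links=["http://c.example/"]),
    "http://c.example/": Page(),
}


def urls_in_scope(seeds, max_hops):
    """Return every URL captured when following at most `max_hops`
    outbound links from each seed. Embeds never count as a hop."""
    in_scope = set()
    visited_pages = set()
    queue = deque((seed, 0) for seed in seeds)
    while queue:
        url, hops = queue.popleft()
        if url in visited_pages:
            continue
        visited_pages.add(url)
        in_scope.add(url)
        page = PAGES.get(url, Page())
        in_scope.update(page.embeds)   # embeds come along with the page
        if hops < max_hops:            # follow links only within the hop budget
            queue.extend((link, hops + 1) for link in page.links)
    return in_scope


level_0 = urls_in_scope(["http://a.example/"], max_hops=0)
level_1 = urls_in_scope(["http://a.example/"], max_hops=1)
```

Under these assumptions, `level_0` holds only the seed and its embeds, while `level_1` also holds each linked page and that page's embeds, which matches the "level 1" wording in the description.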

collection
eye 1.1B

The seed for Wide00014 was:
- Slash pages from every domain on the web:
-- a list of domains using Survey crawl seeds
-- a list of domains using Wide00012 web graph
-- a list of domains using Wide00013 web graph
- Top ranked pages (up to a max of 100) from every linked-to domain using the Wide00012 inter-domain navigational link graph
-- a ranking of all URLs that have more than one incoming inter-domain link (rank was determined by number of incoming links using Wide00012 inter domain links)...

Custom Crawl Services
collection
102,706
ITEMS
986.2M
VIEWS
-
collection
eye 986.2M

Large-scale web harvests and national domain crawls performed for National Libraries, National Archives, preservation partners, research initiatives, and as part of special projects and custom crawling and research services.
Topic: ccs

ArchiveBot: The Archive Team Crowdsourced Crawler
collection
18,185
ITEMS
913.9M
VIEWS
-
collection
eye 913.9M

ArchiveBot is an IRC bot designed to automate the archival of smaller websites (e.g. up to a few hundred thousand URLs). You give it a URL to start at, and it grabs all content under that URL, records it in a WARC, and then uploads that WARC to ArchiveTeam servers for eventual injection into the Internet Archive (or other archive sites). To use ArchiveBot, drop by #archivebot on EFNet. To interact with ArchiveBot, you issue commands by typing it into the channel. Note you will need channel...
Topics: archiveteam, archivebot, webcrawl, robot, love

Wide Crawl started April 2013
collection
25,035
ITEMS
806.4M
VIEWS
-
collection
eye 806.4M

Web wide crawl with initial seedlist and crawler configuration from April 2013.

web-group-internal
collection
32,532
ITEMS
805.6M
VIEWS
-
collection
eye 805.6M

miscellaneous data
Topic: brad tofel

Wiki Collections
collection
2,063,552
ITEMS
794.6M
VIEWS
-
collection
eye 794.6M

Collections of Wiki data
Topics: crawls, data, wiki

Wikipedia Outlinks
collection
40,513
ITEMS
789.9M
VIEWS
-
collection
eye 789.9M

Crawl of outlinks from wikipedia.org. These files are currently not publicly accessible. From Wikipedia: Wikipedia is a multilingual, web-based, free-content encyclopedia project operated by the Wikimedia Foundation and based on an openly editable model. The name "Wikipedia" is a portmanteau of the words wiki (a technology for creating collaborative websites, from the Hawaiian word wiki, meaning "quick") and encyclopedia. Wikipedia's articles provide links to guide the...

alexa_2007
collection
7,636
ITEMS
788.7M
VIEWS
-
collection
eye 788.7M

This data is currently not publicly accessible.

collection
eye 759.4M

The seed for this crawl was a list of every host in the Wayback Machine. This crawl was run at level 1 (URLs including their embeds, plus the URLs of all outbound links including their embeds). The WARC files associated with this crawl are not currently available to the general public.

Wikipedia Near Real Time (from IRC)
collection
18,235
ITEMS
736.9M
VIEWS
-
collection
eye 736.9M

This is a collection of web page captures from links added to, or changed on, Wikipedia pages. The idea is to bring reliability to Wikipedia outlinks so that if the pages referenced by Wikipedia articles are changed, or go away, a reader can permanently find what was originally referred to. This is part of the Internet Archive's attempt to rid the web of broken links.
Topics: Wikipedia, Wikimedia

Additional Collections
collection
13,150,021
ITEMS
693.2M
VIEWS
-
collection
eye 693.2M

Additional collections of scanned books, articles, and other texts (usually organized by topic) are presented here.

Wide Crawl started June 2014
collection
45,341
ITEMS
683.3M
VIEWS
-
collection
eye 683.3M

Web wide crawl with initial seedlist and crawler configuration from June 2014.

Wayback Indexes
collection
554
ITEMS
682.1M
VIEWS
-
collection
eye 682.1M

Wayback indexes. This data is currently not publicly accessible.

Wide Crawl Number 12 - started March 14th, 2015
collection
49,621
ITEMS
681.7M
VIEWS
-
collection
eye 681.7M

Web wide crawl with initial seedlist and crawler configuration from January 2015.

collection
eye 602.8M

Web wide crawl.

collection
eye 555.1M

The seed for this crawl was a list of every host in the Wayback Machine. This crawl was run at level 1 (URLs including their embeds, plus the URLs of all outbound links including their embeds). The WARC files associated with this crawl are not currently available to the general public.

Ourmedia
collection
342,406
ITEMS
529.9M
VIEWS
-
collection
eye 529.9M

Welcome to Ourmedia, an initiative devoted to creating and sharing works of personal media. Video blogs, photo albums, original music, documentary journalism, music videos, children's tales, Flash animations, student films -- all kinds of digital works have begun to flourish as the Web matures into a rich multimedia network. The Ourmedia project was started by members of the creative and technology communities in the summer of 2004 as a way of advancing the spread of personal media. Our...

Wide Crawl started August 2013
collection
21,932
ITEMS
522.3M
VIEWS
-
collection
eye 522.3M

Web wide crawl with initial seedlist and crawler configuration from August 2013.

Arts & Music
collection
15,668
ITEMS
518.6M
VIEWS
-
collection
eye 518.6M

This library of arts and music videos features This or That (a burlesque game show), the Coffee House TV arts program, punk bands from Punkcast and live performances from Groove TV. Many of these movies are available for free download.

alexa_2006
collection
6,507
ITEMS
505.8M
VIEWS
-
collection
eye 505.8M

This data is currently not publicly accessible.

Electric Sheep
collection
564
ITEMS
505.1M
VIEWS
-
collection
eye 505.1M

Electric Sheep is a distributed computing project for animating and evolving fractal flames, which are in turn distributed to the networked computers, which display them as a screensaver. Process The process is transparent to the casual user, who can simply install the software as a screensaver. Alternatively, the user may become more involved with the project, manually creating a fractal flame file for upload to the server where it is rendered into a video file of the animated fractal flame....
Topic: electric sheep

Wide Crawl Number 13
collection
46,050
ITEMS
484M
VIEWS
-
collection
eye 484M

Web Wide Crawl Number 13

-
collection
eye 483M

The seeds for this crawl came from:
- 251 million domains that had at least one link from a different domain in the Wayback Machine, across all time
- ~300 million domains that we had in the Wayback, across all time
- 55,945,067 domains from https://archive.org/details/wide00016
This crawl was run with a Heritrix setting of "maxHops=0" (URLs including their embeds). The WARC files associated with this crawl are not currently available to the general public.

Live Music Archive
collection
200,462
ITEMS
479.8M
VIEWS
-
collection
eye 479.8M

The Live Music Archive is a community committed to providing the highest quality live concerts in a lossless, downloadable format, along with the convenience of on-demand streaming. In 2002, the Internet Archive teamed up with etree.org to create the Live Music Archive in order to preserve and archive as many live concerts as possible for current and future generations to...
Topic: Live Music

Canadian Libraries
collection
595,436
ITEMS
475.2M
VIEWS
-
collection
eye 475.2M

Welcome to the Canadian Libraries page. The Toronto scanning centre was established in 2004 on the campus of the University of Toronto. From its humble beginnings, Internet Archive Canada has worked with well over 50 institutions, providing open access to their unique materials and sharing these collections the world over. From the Archives of the Sisters of Service to the University of Alberta, IAC has digitized approximately 522,741 unique and special collections. Many...
Topic: Texts

GDELT
collection
53,056
ITEMS
464M
VIEWS
-
collection
eye 464M

A daily crawl of more than 200,000 home pages of news sites, including the pages linked from those home pages. Site list provided by The GDELT Project.
Topics: GDELT, News

Wide Crawl started January 2012
collection
30,373
ITEMS
463.2M
VIEWS
-
collection
eye 463.2M

Web wide crawl with initial seedlist and crawler configuration from January 2012 using HQ software.

Top News
collection
98,973
ITEMS
451.9M
VIEWS
-
collection
eye 451.9M

A daily collection of hundreds of the world's top news sites.
Topics: daily, news

collection
eye 441M

Web wide crawl number 16. The seed list for Wide00016 was made from the join of the top 1 million domains from CISCO and the top 1 million domains from Alexa.
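One plausible reading of "join" here is a set union of the two domain lists with duplicates removed. The sketch below assumes that reading. The function name `merge_seed_lists` and the sample domains are made up for illustration, and the real lists would each hold about one million entries.

```python
def merge_seed_lists(*ranked_lists):
    """Union several ranked domain lists, keeping first-seen order
    and comparing domains case-insensitively."""
    seen = set()
    merged = []
    for ranked in ranked_lists:
        for domain in ranked:
            key = domain.strip().lower()
            if key and key not in seen:
                seen.add(key)
                merged.append(key)
    return merged


# Tiny stand-ins for the two top-1-million lists.
list_a = ["example.com", "example.org", "Example.NET"]
list_b = ["example.org", "example.net", "example.edu"]
seeds = merge_seed_lists(list_a, list_b)
```

Because the two source lists overlap, the merged seed list is shorter than the sum of its inputs: here six input entries yield four seeds.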

collection
eye 432M

The seed for this crawl was a list of every host in the Wayback Machine. This crawl was run at level 1 (URLs including their embeds, plus the URLs of all outbound links including their embeds). The WARC files associated with this crawl are not currently available to the general public.

Wikipedia Outlinks March 2016
collection
27,594
ITEMS
409.5M
VIEWS
-
collection
eye 409.5M

Crawl of outlinks from wikipedia.org started March 2016. These files are currently not publicly accessible. Properties of this collection: It has been several years since the last time we did this. For this collection, several things were done: 1. Turned off duplicate detection. This collection will be complete, as there is a good chance we will share the data, and sharing data with pointers to random other collections is a complex problem. 2. For the first time, did all the different wikis....

Wide Crawl started April 2012
collection
39,279
ITEMS
407.2M
VIEWS
-
collection
eye 407.2M

Web wide crawl with initial seedlist and crawler configuration from April 2012.

The Internet Archive Software Collection
collection
441,587
ITEMS
386.3M
VIEWS
-
collection
eye 386.3M

The Internet Archive Software Collection is the largest vintage and historical software library in the world, providing instant access to millions of programs, CD-ROM images, documentation and multimedia. The collection includes a broad range of software related materials including shareware, freeware, video news releases about software titles, speed runs of actual software game play, previews and promos for software games, high-score and skill replays of various game genres, and the art of...

collection
eye 368.7M

The seed for this crawl was a list of every host in the Wayback Machine. This crawl was run at level 1 (URLs including their embeds, plus the URLs of all outbound links including their embeds). The WARC files associated with this crawl are not currently available to the general public.

-
collection
eye 367.9M

Wide17 was seeded with the "Total Domains" list of 256,796,456 URLs provided by Domains Index on June 26th, and crawled with max-hops set to "3" and de-duplication set "on".

Data Collection
collection
651,635
ITEMS
333.8M
VIEWS
-
collection
eye 333.8M

Wordpress Blogs and the Pages They Link To
collection
27,571
ITEMS
319.9M
VIEWS
-
collection
eye 319.9M

This is a collection of pages and embedded objects from WordPress blogs and the external pages they link to. Captures of these pages are made on a continuous basis, seeded from a feed of new or changed pages hosted by WordPress.com or by WordPress sites running a properly configured Jetpack WordPress plugin.
Topics: Wordpress.com, blogs, jetpack

Archive Team: The News Roundup
collection
38,892
ITEMS
314.9M
VIEWS
-
collection
eye 314.9M

Archive Team now searches many, many news sites, including extensive worldwide and obscure sources, to capture unique news stories for history.

Survey Crawl Number 7
collection
6,605
ITEMS
309.9M
VIEWS
-
collection
eye 309.9M

This "Survey" crawl was started on Feb. 24, 2018. This crawl was run with a Heritrix setting of "maxHops=0" (URLs including their embeds). Survey 7 is based on a seed list of 339,249,218 URLs, which is all the URLs in the Wayback Machine that we saw a 200 response code from in 2017, based on a query we ran on Feb. 1st, 2018. The WARC files associated with this crawl are not currently available to the general public.
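The seed-list selection described here amounts to filtering capture records by year and status code, then de-duplicating URLs. The following is only a sketch: the `(url, timestamp, status)` tuple layout, the 14-digit timestamp strings, and the function name `seeds_from_captures` are assumptions for illustration, not the query the Internet Archive actually ran.

```python
def seeds_from_captures(captures, year="2017", status=200):
    """Return the sorted unique URLs with at least one capture in `year`
    that returned the given HTTP status. Each capture is a tuple of
    (url, timestamp, status); timestamp starts with the four-digit year."""
    return sorted({
        url
        for url, timestamp, code in captures
        if timestamp.startswith(year) and code == status
    })


# Hypothetical capture records.
captures = [
    ("http://a.example/", "20170301120000", 200),
    ("http://a.example/", "20170915080000", 200),  # same URL again, kept once
    ("http://b.example/", "20170410000000", 404),  # wrong status
    ("http://c.example/", "20161231235959", 200),  # wrong year
]
seed_list = seeds_from_captures(captures)
```

Only the first URL survives the filter in this toy data, since the other two fail on status or year.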

Wide Crawl started October 2010
collection
15,839
ITEMS
305.7M
VIEWS
-
collection
eye 305.7M

Web wide crawl with initial seedlist and crawler configuration from October 2010.

Wide Crawl started February 2014
collection
9,806
ITEMS
302.1M
VIEWS
-
collection
eye 302.1M

Web wide crawl with initial seedlist and crawler configuration from February 2014.

Wide Crawl Started January 2013
collection
15,157
ITEMS
299.5M
VIEWS
-
collection
eye 299.5M

Wide crawls of the Internet conducted by Internet Archive. Access to content is restricted. Please visit the Wayback Machine to explore archived web sites.

Wide Crawl started September 2012
collection
22,423
ITEMS
294.1M
VIEWS
-
collection
eye 294.1M

Web wide crawl with initial seedlist and crawler configuration from September 2012.

alexa_web_2009
collection
3,080
ITEMS
289.2M
VIEWS
-
collection
eye 289.2M

This data is currently not publicly accessible.

Movies
collection
24,488
ITEMS
287.7M
VIEWS
-
collection
eye 287.7M

Watch full-length feature films, classic shorts, world culture documentaries, World War II propaganda, movie trailers, and films created in just ten hours: These options are all featured in this diverse library! Many of these videos are available for free download.

perma_cc
collection
718,867
ITEMS
277.5M
VIEWS
-
collection
eye 277.5M

alexa_web_2010
collection
2,994
ITEMS
272.9M
VIEWS
-
collection
eye 272.9M

This data is currently not publicly accessible.

Wide Crawl started October 2011
collection
12,648
ITEMS
272.3M
VIEWS
-
collection
eye 272.3M

Web wide crawl with initial seedlist and crawler configuration from March 2011 using HQ software.

University of Toronto - Robarts Library
collection
214,580
ITEMS
264.6M
VIEWS
-
collection
eye 264.6M

The John P. Robarts Research Library, commonly referred to as Robarts Library, is the main humanities and social sciences library of the University of Toronto Libraries and the largest individual library in the university. Opened in 1973 and named for John Robarts, the 17th Premier of Ontario, the library contains more than 4.5 million bookform items, 4.1 million microform items and 740,000 other items. The library building is one of the most significant examples of brutalist architecture in...

stream_only
collection
82,519
ITEMS
255.3M
VIEWS
-
collection
eye 255.3M

Stream-only collection

Wide Crawl started March 2011
collection
8,528
ITEMS
253.2M
VIEWS
-
collection
eye 253.2M

Web wide crawl with initial seedlist and crawler configuration from March 2011. This uses the new HQ software for distributed crawling by Kenji Nagahashi.
What’s in the data set:
- Crawl start date: 09 March, 2011
- Crawl end date: 23 December, 2011
- Number of captures: 2,713,676,341
- Number of unique URLs: 2,273,840,159
- Number of hosts: 29,032,069
The seed list for this crawl was a list of Alexa’s top 1 million web sites, retrieved close to the crawl start date. We used Heritrix (3.1.1-SNAPSHOT)...
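The figures quoted for this crawl allow a couple of quick derived ratios. This is plain arithmetic on the published numbers; the variable names are illustrative only.

```python
# Figures as stated in the collection description.
captures = 2_713_676_341
unique_urls = 2_273_840_159
hosts = 29_032_069

captures_per_url = captures / unique_urls  # average captures of each unique URL
urls_per_host = unique_urls / hosts        # average unique URLs per host

print(f"captures per unique URL: {captures_per_url:.2f}")
print(f"unique URLs per host:    {urls_per_host:.1f}")
```

On these numbers, each unique URL was captured roughly 1.2 times on average, and each host contributed on the order of 78 unique URLs.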

Around The World Crawl
collection
2,150
ITEMS
249.9M
VIEWS
-
collection
eye 249.9M

Data crawled by Sloan Foundation on behalf of Internet Archive

Spirituality & Religion
collection
111,354
ITEMS
247.5M
VIEWS
-
collection
eye 247.5M

View videos about spirituality and religion.

National Library of Australia Crawls
collection
33,826
ITEMS
247.2M
VIEWS
-
collection
eye 247.2M

Crawls performed by Internet Archive on behalf of the National Library of Australia. This data is currently not publicly accessible.

.com survey started January 2011
collection
2,535
ITEMS
247.1M
VIEWS
-
collection
eye 247.1M

Survey crawl of .com domains started January 2011.
Topic: webcrawl

Community Spirituality and Religion
collection
110,861
ITEMS
246.3M
VIEWS
-
collection
eye 246.3M

These religion and spirituality videos were contributed by Archive users.

California Digital Library
collection
191,918
ITEMS
225.9M
VIEWS
-
collection
eye 225.9M

The California Digital Library supports the assembly and creative use of the world's scholarship and knowledge for the University of California libraries and the communities they serve. In addition, the CDL provides tools that support the construction of online information services for research, teaching, and learning, including services that enable the UC libraries to effectively share their materials and provide greater access to digital content.

38_crawl
collection
1,387
ITEMS
217.1M
VIEWS
-
collection
eye 217.1M

This data is currently not publicly accessible.

Netlabels
collection
72,388
ITEMS
214.5M
VIEWS
Aug 9, 2007
collection
eye 214.5M

Welcome to the Netlabels collection at the Internet Archive. This collection hosts complete, freely downloadable/streamable, often Creative Commons-licensed catalogs of 'virtual record labels'. These 'netlabels' are non-profit, community-built entities dedicated to providing high quality, non-commercial, freely distributable MP3/OGG-format music for online download in a multitude of genres. Styles include: melodic electronica (e.g. Observatory Online, Please Do Something) minimal house (...

Wikipedia Outlinks February 2012
collection
2,951
ITEMS
200.3M
VIEWS
-
collection
eye 200.3M

Crawl of outlinks from wikipedia.org started February 2012. These files are currently not publicly accessible.

Folksoundomy: A Library of Sound
collection
551,648
ITEMS
193.9M
VIEWS
-
collection
eye 193.9M

Folksonomy: A system of classification derived from the practice and method of collaboratively creating and managing tags to annotate and categorize content; this practice is also known as collaborative tagging, social classification, social indexing, and social tagging. Coined by Thomas Vander Wal, it is a portmanteau of folk and taxonomy. Folksoundomy: A collection of sounds, music and speech derived from the efforts of volunteers to make information as widely available as possible. Because...

Survey Crawl Number 8
collection
8,250
ITEMS
188.4M
VIEWS
-
collection
eye 188.4M

51_crawl
collection
1,138
ITEMS
181.2M
VIEWS
-
collection
eye 181.2M

This data is currently not publicly accessible.

Feature Films
collection
6,441
ITEMS
179.7M
VIEWS
-
collection
eye 179.7M

Feature films, shorts, silent films and trailers are available for viewing and downloading. Enjoy! View a list of all the Feature Films sorted by popularity. Do you want to post a feature film? First, figure out if it's in the Public Domain. Read this FAQ about determining if something is PD. If you're still not sure, post a question to the forum below with as much information about the movie as possible. One of our users might have relevant information.
Topic: Moving Images

Prelinger Archives
collection
7,392
ITEMS
176.8M
VIEWS
-
collection
eye 176.8M

View thousands of films from the Prelinger Archives! Prelinger Archives was founded in 1983 by Rick Prelinger in New York City. Over the next twenty years, it grew into a collection of over 60,000 "ephemeral" (advertising, educational, industrial, and amateur) films. In 2002, the film collection was acquired by the Library of Congress, Motion Picture, Broadcasting and Recorded Sound Division. Prelinger Archives remains in existence, holding approximately 11,000 digitized and...

52_crawl
collection
2,589
ITEMS
175.2M
VIEWS
-
collection
eye 175.2M

This data is currently not publicly accessible.

Alexa Crawl EG
collection
1,678
ITEMS
174.3M
VIEWS
-
collection
eye 174.3M

Crawl EG from Alexa Internet. This data is currently not publicly accessible.

web_wk
collection
9,973
ITEMS
171.7M
VIEWS
-
collection
eye 171.7M

Crawl performed by Internet Archive. This data is currently not publicly accessible.

National Library of Spain Crawls
collection
6,742
ITEMS
160.3M
VIEWS
-
collection
eye 160.3M

Data collected by Internet Archive on behalf of the National Library of Spain. This data is currently not publicly accessible.

United States Patent and Trademark Office documents
collection
409,692
ITEMS
159.8M
VIEWS
-
collection
eye 159.8M

United States Patent and Trademark Office documents contributed by Think Computer Foundation.
Topic: U.S. Patent

web_iq
collection
2,637
ITEMS
157.9M
VIEWS
-
collection
eye 157.9M

Crawl performed by Internet Archive. This data is currently not publicly accessible.

European Libraries
collection
684,746
ITEMS
157M
VIEWS
-
collection
eye 157M

Scanned books from various European Libraries.

-
collection
eye 148.6M

The seed for this crawl was a list of every host in the Wayback Machine. This crawl was run at level 1 (URLs including their embeds, plus the URLs of all outbound links including their embeds). The WARC files associated with this crawl are not currently available to the general public.

News & Public Affairs
collection
918,580
ITEMS
147.5M
VIEWS
-
collection
eye 147.5M

This diverse video library offers analysis of news and public affairs that is independent of traditional corporate media. From Democracy Now's daily news program, to three days of TV news coverage following the 9/11 attacks, to Mosaic’s timely clips of Middle East newscasts, to UCSF's Tobacco Industry Videos, these collections offer an alternative way to view and interpret current news and public affairs. Many of these videos are available for free download.

26_crawl
collection
1,466
ITEMS
140.3M
VIEWS
-
collection
eye 140.3M

This data is currently not publicly accessible.